Developer Toolbox v5.0 CD:
HTML (client) / HTTP (server) Issues


November, 1995: Although we are now into the v5.1 world of the DT, this document is still being included because some of its pieces continue to be relevant and continue to be referred to in other documents.


This document describes the HTML (client) and HTTP (server) issues relating to the operation of this, version 5.0, Developer Toolbox CD. The areas discussed are:

I. Environments in which v5.0 will operate:

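  1. an HTTP server exists and is accessible. In this case you can either link /CDROM to /Your/DocumentRoot/toolbox or copy /CDROM to /Your/DocumentRoot/toolbox, and then configure the CGI scripts for your specific system environment.
  2. an HTTP server does not exist. In this case, see either "No HTTP Server, But FTP Internet Access Exists" or "No HTTP Server, And No FTP Internet Access Exists".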

II. The benefits of having an HTTP server running in your local machine's environment.

III. Known Bugs and Limitations of this specific netscape 1.0 Browser:

  1. DO NOT ATTEMPT TO ACCESS compressed PostScript OR showcase FILE LINKS
  2. On non-network-accessible systems AVOID DTjanitor@sgi.com or e-mail the janitor links
  3. If not connected to the Internet, Netscape buttons/pull-downs which won't work
  4. netscape "Known Problems and Workarounds" Release Notes

IV. Defining netscape's "SOCKS Host:" and "Port:" fields to match your environment's firewall

V. World Wide Web.

  1. Currently not, but want to find out about Getting Connected to the Internet and the Web.
  2. Cannot connect to the Web so read about the 2 or 3 types of links which won't work.


Introduction

Welcome to the "first cut" of the next step in the evolution of the Developer Toolbox. This Toolbox is the first to begin fully employing the power afforded by Hyper Text Markup Language (HTML) functionality, where "hyper-links" or "linked-text" (as well as other linked-data, e.g. images, sound files, movie clips, etc.) provide any document file with an enormously useful "cross-reference" capability, which in turn gives documents a multi-dimensional depth and texture not possible with paper and books.

All HTML-written documents are interpreted by and presented inside the sort of client "browser" you are now reading this document with. For all such documents to be "served up" with 100% operability, one must have a Hyper Text Transfer Protocol (HTTP) server running and accessible on the network one's machine is on. With an HTTP server running, one can make requests through the client browser, which are then processed and responded to on the server side of this paradigm. A Web Terminology Glossary is available for those unfamiliar with HTML and HTTP terminology.

The janitor is very pleased to have on-CD copy of all SGI WebFORCE(TM) technical information pages -- as well as Irix 5.3 WebFORCE(TM) product descriptions and information regarding local inst images. With respect to WebFORCE(TM) technical information, the pertinent topic areas are:

  1. Welcome to the World Wide Web, formerly known as the Internet;
  2. Connecting to the Internet and the Web for those not currently "hooked-up";
  3. About Netscape Navigator (this browser you are currently running);
  4. About Silicon Surf(SM), Silicon Graphics' presence on the World Wide Web;
  5. WebFORCE(TM) introduction, describing SGI's WebFORCE(TM) software and hardware products.

The following two sections describe how to "hook-up" the contents of the v5.0 Toolbox CD to an HTTP server if one has access to such a server in one's own local network environment. For those who just "stepped into this document" at this location, be sure to see the overview information at the top to help you understand all the HTML and HTTP issues relating to the operation of this, v5.0 Toolbox. The following two sections,

  1. ln -s /CDROM /Your/DocumentRoot/toolbox,
  2. rcp -pv -r /CDROM /Your/DocumentRoot/toolbox
are predicated on the fact that an HTTP server exists and is accessible. If this is not the case, you can obtain NCSA's public domain HTTP server, if you have ftp Internet access. Otherwise you need to know about the 3 types of links which won't work through this interface as it is currently configured on the v5.0 CD itself.


Linking /CDROM to /Your/DocumentRoot/toolbox

The janitor recommends the following steps as one of two methods to establish a 100%-functional "HTML-ized Toolbox v5.0" for those already running an HTTP server in their environment.

This method, of creating a symlink from /CDROM to /Your/DocumentRoot/toolbox, is for those people who:

  1. don't want (or aren't able) to use up 380+MB of local disk space needed to copy the Toolbox onto their own system disk(s), and
  2. are able to have a CDROM drive be dedicated to housing the v5.0 Toolbox CD (until v5.1 comes out later this spring).
The following steps will create a fully-functional local "Toolbox Web Site" on the system you have your HTTP server running on:

  1. create a symlinked toolbox directory as a child of the Document Root directory.
    Create a symlink in the HTTP server's Document Root directory pointing to /CDROM (MEDIAD(1M)'s default mount point for CDs)--or wherever else you are mounting CDs at--and call this link "toolbox":

    Then, you should be able to call up the URL like so:

  2. tell the HTTP server about the location of the toolbox/www/cgi-bin scripts.
    You need to tell the HTTP server you are running about the location of the cgi-bin scripts directory resident on the Toolbox. The scripts and program in this directory process all the "search" and "pheedbak" links, as well as the Forms (e.g., "Generate") buttons. The janitor has so far only figured out how to do this with Netscape Communications' Netsite Communications and Commerce Servers, and with the NCSA HTTPD Version 1.3 Server. (If anyone knows how this is done with other servers, please e-mail the janitor an equivalent description to the 2 methods below and he will add it to this description.)

    1. Netsite: get into the server as admin via something like:

        netscape http://yourServerName/admin/cgi-bin.html

      and, underneath Add a New CGI Directory, put in (in bold) the following:

      Map URL prefix:  /toolbox/www/cgi-bin
      
      To directory:    /AbsolutePath/to/DocRoot/toolbox/www/cgi-bin
      
      After successfully adding this, restart your server and yer all set.

    2. NCSA HTTPD: let's say your DocumentRoot is /usr/local/www; then add an additional ScriptAlias line to conf/srm.conf that reads:

      ScriptAlias /toolbox/www/cgi-bin/ /usr/local/www/toolbox/www/cgi-bin/

      Running /etc/killall httpd followed by something like,

        /usr/etc/httpd -d /usr/local/www -f /usr/local/www/conf/httpd.conf

      will now add /toolbox/www/cgi-bin/ to your httpd server's group of cgi-bin definitions. You can have up to 20 such concurrently-defined definitions, provided no first argument following "ScriptAlias" is the same as any other.

      After adding this, restart your server and yer all set.

  3. install the oasisIII inst image so you can perform searches throughout the Toolbox.
    A set of inst files is located in the searchtools/dist subdirectory, containing the oasisIII software needed for conducting searches of the contents of the Toolbox. A listing contains all OasisIII installation details, and a separate document describes OasisIII Operability Issues.
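
For NCSA httpd users, steps 1 and 2 above can be sketched as one small shell function. This is only a sketch under stated assumptions: the function name link_toolbox is hypothetical, the CD mount point, Document Root and srm.conf path are passed in as arguments rather than discovered, and the function is written so that running it twice does no harm. It does not restart the server; that is still done as described in step 2.

```shell
# link_toolbox CDROM_MOUNT DOCROOT SRM_CONF   (hypothetical helper name)
link_toolbox() {
  cdrom=$1      # where the CD is mounted, e.g. /CDROM
  docroot=$2    # the HTTP server's Document Root
  srm_conf=$3   # path to the NCSA conf/srm.conf file

  # Step 1: symlink the CD into the Document Root as "toolbox",
  # unless something by that name is already there.
  if [ ! -L "$docroot/toolbox" ] && [ ! -e "$docroot/toolbox" ]; then
    ln -s "$cdrom" "$docroot/toolbox" || return 1
  fi

  # Step 2 (NCSA httpd only): add the ScriptAlias line once.
  alias_line="ScriptAlias /toolbox/www/cgi-bin/ $docroot/toolbox/www/cgi-bin/"
  if ! grep -F -x -q "$alias_line" "$srm_conf" 2>/dev/null; then
    echo "$alias_line" >> "$srm_conf"
  fi
}
```

With the example DocumentRoot used above, the call would be: link_toolbox /CDROM /usr/local/www /usr/local/www/conf/srm.conf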

Copying /CDROM to /Your/DocumentRoot/toolbox

The janitor recommends the following steps as one of two methods to establish a 100%-functional "HTML-ized Toolbox v5.0" for those already running an HTTP server in their environment.

This method, of creating a copy of the contents of /CDROM that lives under /Your/DocumentRoot/toolbox, is for those people who:

  1. Are able to dedicate the 380+MB of local disk space needed to copy the Toolbox onto their own system disk(s), and
  2. Are not able to have a CDROM drive be dedicated to housing the v5.0 Toolbox CD (until v5.1 comes out later this spring).
The following steps will create a fully-functional local "Toolbox Web Site" on the system you have your HTTP server running on:
  1. copy a subset of /CDROM to a child directory on your Document Root called toolbox.
    The files on the v5.0 Toolbox CDROM fall into two distinct categories:
    1. Toolbox-specific files needed for complete v5.0 Toolbox operability, and
    2. SGI inst images not needed for v5.0 operability.
    The Toolbox-specific files comprise the files you will need to copy to the toolbox child directory of your Document Root, while the SGI inst image files group will not be required on yer local disk(s). The space the Toolbox-specific files occupy weighs in at about 380+MB.

    The SGI inst images not needed to be copied reside in two subdirectories,

    and their combined size adds up to 76.9MB.

    IF you have 452MB free on your disk, you could do the following:

    ELSE if you have the minimally required 380+MB, you can do

  2. tell the HTTP server about the location of the toolbox/www/cgi-bin scripts.
    You need to tell the HTTP server you are running about the location of the cgi-bin scripts directory resident on the Toolbox. The scripts and program in this directory process all the "search" and "pheedbak" links, as well as the Forms (e.g., "Generate") buttons. The janitor has so far only figured out how to do this with Netscape Communications' Netsite Communications and Commerce Servers, and with the NCSA HTTPD Version 1.3 Server. (If anyone knows how this is done with other servers, please e-mail the janitor an equivalent description to the 2 methods below and he will add it to this description.)

    1. Netsite: get into the server as admin via something like:

        netscape http://yourServerName/admin/cgi-bin.html

      and, underneath Add a New CGI Directory, put in (in bold) the following:

      Map URL prefix:  /toolbox/www/cgi-bin
      
      To directory:    /AbsolutePath/to/DocRoot/toolbox/www/cgi-bin
      
      After successfully adding this, restart your server and yer all set.

    2. NCSA HTTPD: let's say your DocumentRoot is /usr/local/www; then add an additional ScriptAlias line to conf/srm.conf that reads:

      ScriptAlias /toolbox/www/cgi-bin/ /usr/local/www/toolbox/www/cgi-bin/

      Running /etc/killall httpd followed by something like,

        /usr/etc/httpd -d /usr/local/www -f /usr/local/www/conf/httpd.conf

      will now add /toolbox/www/cgi-bin/ to your httpd server's group of cgi-bin definitions. You can have up to 20 such concurrently-defined definitions, provided no first argument following "ScriptAlias" is the same as any other.

      After adding this, restart your server and yer all set.

  3. install the oasisIII inst image so you can perform searches throughout the Toolbox.
    A set of inst files is located in the searchtools/dist subdirectory, containing the oasisIII software needed for conducting searches of the contents of the Toolbox. A listing contains all OasisIII installation details, and a separate document describes OasisIII Operability Issues.
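
The IF/ELSE choice in step 1 of this copy method comes down to a free-space check on the disk holding your Document Root. A minimal sketch of that check follows. The function names free_mb and copy_plan are hypothetical, the thresholds are the 452MB and 380MB figures quoted above, and the -P flag of df (POSIX output format) is assumed to be available on your system. The sketch only reports which case applies; it does not perform the copy.

```shell
# free_mb DIR: print the free space, in whole MB, on the filesystem holding DIR.
# Assumes a df that accepts the POSIX -P and -k flags.
free_mb() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# copy_plan MB: decide which copy is possible with MB megabytes free.
copy_plan() {
  mb=$1
  if [ "$mb" -ge 452 ]; then
    echo full          # room for the whole CD, inst images included
  elif [ "$mb" -ge 380 ]; then
    echo subset        # room for the Toolbox-specific files only
  else
    echo insufficient  # consider the symlink method instead
  fi
}
```

For example, copy_plan "$(free_mb /usr/local/www)" reports the case for a Document Root of /usr/local/www.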


Configuring DT CGI scripts For Your Specific Environment

After you've either linked or copied /CDROM to /Your/DocumentRoot/toolbox, you need to create two files, /usr/tmp/.DT_OksvrRoot and /usr/tmp/.DT_DocRootFile, so the CGI scripts in toolbox/www/cgi-bin will work correctly. /usr/tmp/.DT_OksvrRoot tells the oksvr program, plus the osearch-cgi and oretrieve-cgi scripts, the location of the index files it needs to know about, while /usr/tmp/.DT_DocRootFile tells the tar-cgi, tarDFList-cgi, and tarsend-cgi scripts the location of your server's Document Root.
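
As a rough illustration only, the two files could be created along the following lines. The format of the files is not spelled out here, so this sketch ASSUMES each one holds a single line containing a directory path; check the scripts in toolbox/www/cgi-bin before relying on that. The function name write_dt_config is hypothetical, and the location of the index files is passed in as an argument because it is not given in this section.

```shell
# write_dt_config INDEX_ROOT DOC_ROOT [CONF_DIR]   (hypothetical helper)
# ASSUMPTION: each file holds one line containing a directory path.
write_dt_config() {
  index_root=$1             # directory holding the oasisIII index files
  doc_root=$2               # the HTTP server's Document Root
  conf_dir=${3:-/usr/tmp}   # where the two dot-files are written

  printf '%s\n' "$index_root" > "$conf_dir/.DT_OksvrRoot" &&
  printf '%s\n' "$doc_root" > "$conf_dir/.DT_DocRootFile"
}
```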

The following four sections present information relevant for environments where Internet Access may or may not exist:

  1. No HTTP Server, But FTP Internet Access Exists
  2. Why Run an HTTP Server? An Explanation Of The Benefits
  3. No HTTP Server, And No FTP Internet Access Exists
    (presents alternatives for the links that won't work)
  4. No Electronic Connection to the Internet
    (points to documentation about getting connected to the Internet)
For those who just "stepped into this document" at this location, be sure to see the overview information at the top to help you understand all the HTML and HTTP issues relating to the operation of this, v5.0 Toolbox.


No HTTP Server, But FTP Internet Access Exists

For people who don't have an httpd server running but do have ftp Internet access, NCSA's public domain HTTP server (as of March, 1995, the current release is 1.3) is available via ftp at ftp.ncsa.uiuc.edu:~ftp/Web/httpd/Unix/ncsa_httpd/httpd_1.3R. Doing a list in this directory (as of March 18, 1995) shows:
-rwxr-xr-x   1 11113    729        50823 Feb 20 13:30 httpd_ascii_docs.tar.Z
-rwxr-xr-x   1 11113    729       309775 Mar  1 18:35 httpd_decaxp.tar.Z
-rwxr-xr-x   1 11113    729      1177671 Feb 20 13:29 httpd_docs.tar.Z
-rwxr-xr-x   1 11113    729       437151 Feb 20 13:25 httpd_hp.tar.Z
-rwxr-xr-x   1 11113    729        89045 Feb 20 13:30 httpd_postscript_docs.tar.Z
-rwxr-xr-x   1 11113    729       329551 Feb 20 13:20 httpd_rs6000.tar.Z
-rwxr-xr-x   1 11113    729       447290 Feb 20 13:33 httpd_sgi.tar.Z
-rwxr-xr-x   1 11113    729       108663 Mar  9 11:51 httpd_solaris.tar.Z
-rwxr-xr-x   1 11113    729       115211 Feb 20 13:00 httpd_source.tar.Z
-rwxr-xr-x   1 11113    729       222311 Feb 20 13:00 httpd_sun4.tar.Z
One can also obtain precompiled server binaries if you have Web access.
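
To pick an archive from the listing above, one can go by what uname reports on the machine that will run the server. The mapping below is a guess inferred from the archive names alone (sgi for IRIX, sun4 for SunOS 4.x, solaris for SunOS 5.x, hp for HP-UX, rs6000 for AIX, decaxp for OSF/1 on DEC Alpha), not something NCSA's listing states, and the function name ncsa_archive_for is hypothetical. Anything unrecognized falls back to the source archive.

```shell
# ncsa_archive_for SYSNAME RELEASE   (hypothetical helper)
# Prints the archive name that appears to match the platform.
ncsa_archive_for() {
  sysname=$1    # output of: uname -s
  release=$2    # output of: uname -r
  case $sysname in
    IRIX*) echo httpd_sgi.tar.Z ;;
    HP-UX) echo httpd_hp.tar.Z ;;
    AIX)   echo httpd_rs6000.tar.Z ;;
    OSF1)  echo httpd_decaxp.tar.Z ;;
    SunOS)
      case $release in
        4.*) echo httpd_sun4.tar.Z ;;
        5.*) echo httpd_solaris.tar.Z ;;
        *)   echo httpd_source.tar.Z ;;
      esac
      ;;
    *)     echo httpd_source.tar.Z ;;
  esac
}
```

Calling ncsa_archive_for "$(uname -s)" "$(uname -r)" on the target machine prints the candidate archive; once fetched, a .tar.Z archive is normally unpacked with something like zcat httpd_sgi.tar.Z | tar xvf -.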

Once you have successfully installed and configured your NCSA HTTP server, you can either

  1. ln -s /CDROM /Your/DocumentRoot/toolbox, or
  2. rcp -pv -r /CDROM /Your/DocumentRoot/toolbox
to create a 100%-functional HTML-ized v5.0 Toolbox.


Why Run an HTTP Server? An Explanation Of The Benefits.

For those who haven't experienced this sort of "web technology" yet, you may be wondering, "yeah but why would I want to get this HTTP server thing? --what does it really provide me with?" These are perfectly valid questions. While the Toolbox janitor is by no means any sort of "expert" with regard to this topic, the following is an attempt to explain some of the benefits running an HTTP server provides. (A local copy of the World Wide Web FAQ is an excellent "jump-off" place to start delving into all this stuff for those interested in learning more.)

No HTTP Server, And No FTP Internet Access Exists

For those who do not have an HTTP server running in their environment, and do not have FTP Internet access (i.e., so it is not an option to obtain NCSA's public domain HTTP server), the three types of links listed below will not work and will instead report "Unable to locate file". This occurs because all these links require a running HTTP server to execute the scripts they point to.

ALTERNATIVES AVAILABLE IN LIEU OF THE 3 TYPES OF LINKS:

  1. Alternative to HTML-Search functionality:
    You can run the sifttree script in the top-of-tree toolbox directory either from a shell or a dirview window. This script will invoke a motif-based GUI version of the oasisIII program in the same way this script has worked on past Toolboxes. So although you can't access the Search links through this netscape browser, you can still perform all the same kinds of searches through the motif interface that were available in the past. PLUS, the fact is, the HTML-form of oasisIII does not YET have the range of power-and-functionality which the motif-based GUI already affords. In time this will NOT be the case, but for now it is...

  2. Alternative to Pheedbak and DTjanitor@sgi.com links:
    Both of these links invoke a form (requiring the server) allowing you to create an e-mail message and then send it back to the janitor. But, of course, this can still be done simply by sending e-mail directly to DTjanitor@sgi.com from a shell window. For those without e-mail access, we WANT to "hear" from you about ANYTHING you'd like to tell us about all this stuff. We can also be reached at:

  3. Alternative to [Generate] a compressed tar image buttons:
    These buttons were specifically crafted for use in the DT web sites for people who want to pull anything back across the wire to their own local machine, but, with the CD acting as a local read-only disk, rcp can obviously do just as handy a job if you find you want to copy files from the CD to your own local disk.

    There are two distinct ways one can copy files off to one's own local disks. To identify where on the CD the given page is located, refer to the "Location" text widget line/box at the top of the netscape browser window (between the two lines of buttons). This line can be "edited," as well as cut-and-pasted:

    1. Grab specific files from inside the netscape browser window:
      1. move your MOUSE up to the "Location" text widget
      2. delete the filename at the end of the path
      3. with the Mouse still in the text widget press the Enter key
      4. press LEFTMOUSE on the icon-link to the specific file you want
      5. press LEFTMOUSE on the File pulldown and select the Save As entry
      6. You have three formats-to-save-file-as to choose from:
        1. Text: strips out all HTML-embedded elements, TABS, etc.
        2. Source: good for "preserved HTML", program source files, etc.
        3. PostScript: use to create PostScript-based version of HTML file
      7. Be sure to also specify where to save the file in the selection text widget.

    2. Grab files or whole directories via rcp:
      1. move your MOUSE up to the "Location" text widget
      2. delete the filename at the end of the path
      3. with the Mouse still in the text widget press the Enter key
      4. press LEFTMOUSE to select the entire path from /
      5. in a shell window change directory to where you want to copy to
      6. type "rcp -pv -r", then press the F4 key or MIDDLEMOUSE, and then type dot, `.', and press enter.
        This will copy the entire directory you selected in the "Location" widget. If you wish to just grab files instead, you can do the following with rcp:

        rcp -pv /abs/path/to/current/dir/{filename1,filename2,...,filenameN} .
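
Note that the {filename1,filename2,...} brace form in the rcp command above is expanded by the C shell, but not by a plain Bourne shell. For shells without brace expansion, a loop does the same job. The sketch below is one way to write it; the function name copy_named and the COPY variable are hypothetical, with COPY defaulting to the rcp -pv command used above so that another copy command can be swapped in.

```shell
# copy_named SRC_DIR DEST_DIR FILE...   (hypothetical helper)
# Copies each named file from SRC_DIR into DEST_DIR, one at a time.
copy_named() {
  src=$1
  dest=$2
  shift 2
  for name in "$@"; do
    # COPY is deliberately left unquoted so that "rcp -pv" splits
    # into the command and its flags.
    ${COPY:-rcp -pv} "$src/$name" "$dest" || return 1
  done
}
```

For example: copy_named /abs/path/to/current/dir . filename1 filename2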

admittedly, the above "alternatives" are a pretty shitty interim way to "help" people hobble along. all this sort of tripe would be a non-issue if the janitor had succeeded in finding a way to include an HTTP server on the Toolbox so EVERYTHING would have simply worked from the very start for EVERYONE. chalk this up to more of the janitor's learning curve. just wait until v5.1 arrives--the janitor expects you will be very pleezed with the results!


No Electronic Connection to the Internet

There are a number of Uniform Resource Locators, or URLs, scattered throughout the HTML documents on this Toolbox which you will not be able to access, or have the full functionality of, if you are not connected to the Internet. (For those not familiar with URLs, see the A Beginner's Guide to URLs document in the www subtree.)

The following discussion is written for both those people working in secure environments who will never be able to be connected, as well as those people working in environments where connecting to the Internet is a possibility. However, the last part of this discussion, Getting Connected to the Web will only be useful for people in the latter situation.


The remaining three sections present issues surrounding the operation of the Irix 5.2 netscape program that exists in the top-of-tree directory of this, v5.0, Toolbox.

  1. netscape 1.0 bug: DO NOT ACCESS compressed PostScript OR showcase FILE LINKS
  2. On non-network-accessible systems AVOID DTjanitor@sgi.com or e-mail the janitor links
  3. If not connected to the Internet, Netscape buttons/pull-downs which won't work
For those who just "stepped into this document" at this location, be sure to see the overview information at the top to help you understand all the HTML and HTTP issues relating to the operation of this, v5.0 Toolbox.


Netscape 1.0 Known Bug Concerning "Content Encoding:"

There is a known bug in version 1.0 of netscape: it does not support parsing of the "Content-encoding:" header. What this means is the compressed PostScript and showcase file links throughout HTML documents on the Toolbox will not work if accessed through netscape. What is supposed to happen when you access a link pointing to the compressed file is, A.) the file is first uncompressed, and B.) the PostScript or showcase viewer on your local machine is then automatically spawned so you can view the given uncompressed document. For the present time, do not attempt to access the compressed-version PostScript or showcase links--just stick with the uncompressed versions and you'll be able to view any-and-all of these files.

Another point to note: when you access a link to a compressed file like PostScript or showcase, what happens (when things work) is the file is first uncompressed, and a temporary copy of this uncompressed file is created on your local disk (usually in some place like /usr/tmp). The benefit of accessing the uncompressed file link is that no such temporary local copy of said file is created on your local disk. This is a benefit for those who suffer from chronic disk space shortages and can't afford to house [large] files--even just temporary ones--on their own disk(s). Obviously, where viewing a document across the Internet is concerned, there clearly are advantages to first bringing the file across in a compressed form, and then uncompressing it after it's "completed its trip". This was a large motivating factor in crafting the [Generate] a compressed tar image... buttons and accompanying CGI-script functionalities at the bottom of every directory's index.html file page.
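
The two steps netscape is supposed to perform for a compressed link, A.) uncompress into a temporary file and B.) spawn the viewer on it, can be done by hand from a shell as a stopgap. The following is only a sketch of that sequence: the function name view_compressed is hypothetical, zcat is assumed to be on your path, and the viewer is whatever PostScript or showcase program you normally use. As noted above, it leaves a temporary uncompressed copy behind (in /usr/tmp unless told otherwise).

```shell
# view_compressed FILE.Z VIEWER [TMPDIR]   (hypothetical helper)
view_compressed() {
  src=$1                    # path to the compressed file
  viewer=$2                 # viewer program to spawn on the result
  tmpdir=${3:-/usr/tmp}     # where the uncompressed copy is written

  out="$tmpdir/$(basename "$src" .Z)"

  # A.) uncompress into a temporary copy
  zcat "$src" > "$out" || return 1

  # B.) hand the uncompressed copy to the viewer
  "$viewer" "$out"
}
```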


"mailto" URL Links Crash netscape On Non-Network Access Systems

If you are running on a system that is not connected to a network, and you have sendmail running, DO NOT TRY TO ACCESS ANY mailto URLs -- i.e., "DTjanitor@sgi.com," "send e-mail to," or "e-mail the janitor" links -- as this will cause netscape to core dump and crash.

While testing this CD on a standalone system not connected to a network and not running sendmail, we discovered a fatal bug relating to the mailto URL which causes netscape to dump core and crash. The sequence goes like this:

  1. access a "DTjanitor@sgi.com," "send e-mail to," or "e-mail the janitor" link
  2. type text in the Subject: and "main area" text widgets
  3. move the Mail window aside
  4. step through 3-or-more links to other documents
  5. go back to the Mail window and press the "Send Mail" button
  6. it may take some minutes, but in time, netscape will core dump and crash.


If not connected to the Internet, some Netscape buttons/pull-downs won't work

For those people running the v5.0 CD in an environment not connected to the Internet, a majority of the 2 rows of buttons and pull-down menu entries at the top of the netscape browser window will not work.


Defining netscape's "SOCKS Host:" and "Port:" fields to match your environment's firewall

Configuring your netscape browser to know about your environment's firewall.

Two segments follow: the first contains words-of-advice from one of the webforce-software gauds inside SGI; the second is an excerpt from Netscape Communications Corporation's documentation regarding proxy software/servers.

The janitor was asking for information about the tricky business of correctly configuring your access to the Internet if your local network environment is behind a firewall. The response included the following helpful advice:

One other thing you may have to worry about is if they have NIS or DNS configured and running, then they may need to have a properly configured /etc/resolv.conf file in order to resolve host names.

A second point is when running socks, you may need a properly configured /etc/socks.conf so that accesses within the network don't use socks, while accesses outside the network do use socks. It also helps to cut down the number of internal sites looking like they came from the socks server.
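
For the first point, a quick sanity check on /etc/resolv.conf is to confirm it names at least one name server. The sketch below does only that; the function name has_nameserver is hypothetical, and it relies on the standard resolv.conf convention that each server is given on a line beginning with the keyword nameserver. It says nothing about whether the listed server is reachable or correct for your site, and it does not touch /etc/socks.conf.

```shell
# has_nameserver [FILE]   (hypothetical helper)
# Succeeds if FILE (default /etc/resolv.conf) has a "nameserver" line.
has_nameserver() {
  conf=${1:-/etc/resolv.conf}
  grep -q '^nameserver[[:space:]]' "$conf" 2>/dev/null
}
```

For example: has_nameserver || echo "no nameserver line in /etc/resolv.conf"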

The following excerpt is included from Netscape Communications Corporation's http://home.netscape.com/newsref/manual/docs/menus.html#C21 "Menu items" page discussing "Mail and Proxies (dialog in Preferences in Options)", to help clarify the issue of correctly defining one's "SOCKS Host:" and "Port:" fields inside the Mail and Proxies window of your Options->Preferences pull-down menu entry.

Ordinarily, the Netscape application does not require proxies to interact with the network services of external sources. However, in some network configurations the connection between the Netscape application and a remote server is blocked by a firewall. Firewalls protect information in internal computer networks from external access. In doing so, firewalls may limit Netscape's ability to exchange information with external sources.

To overcome this limitation, Netscape can interact with proxy software. A proxy server sits atop a firewall and acts as a conduit, providing a specific connection for each network service protocol. If you are running Netscape on an internal network from behind a firewall, you will need to ascertain from your system administrator the names and associated port numbers for the server running proxy software for each network service. Proxy software retains the ability to communicate with external sources, yet is trusted to communicate with the internal network.

A single computer may run multiple servers, each server connection identified with a port number. A proxy server, like an HTTP server or a FTP server, occupies a port. Typically, a connection uses standardized port numbers for each protocol (for example, HTTP = 80 and FTP = 21). However, unlike common server protocols, the proxy server has no default port. Netscape requires that for each proxy server you specify in a Proxy text field, you also specify its port number in the Port field.

Text fields for proxies and ports are offered for FTP (File Transfer Protocol), Gopher, HTTP (HyperText Transfer Protocol), Security (Secure Sockets Layer protocol), WAIS (Wide Area Information System), and SOCKS (firewall bypass software).

The text field No Proxy for: lets you bypass the proxy server for one or more specified local domains. For example, if you specify:

then all HTTP requests for the adomain, bdomain, and netscape.com host servers go from Netscape directly to the host (not using any proxy). All HTTP requests for other servers go from Netscape through the proxy server aserver on port 8080, then to the host. A proxy that runs on a host server outside a firewall cannot connect to server inside the firewall. To bypass the firewall's restriction, you must set the No Proxy for field to include any internal server you're using. If you use local hostnames without the domain name, you should list them the same way. Multiple hostnames are delimited by commas and the wildcard character (*) cannot be used.
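
The routing rule in the excerpt above can be modelled in a few lines of shell. This is a sketch of the described behavior, not Netscape's actual code: the function name route_for is hypothetical, and it ASSUMES an entry matches either the exact hostname or any host ending in a dot followed by that entry (so netscape.com also covers hosts within that domain). Entries are comma-delimited and no wildcards are interpreted, per the excerpt.

```shell
# route_for HOST NO_PROXY_LIST   (hypothetical helper)
# Prints "direct" if HOST matches an entry in the comma-delimited
# list, otherwise "proxy".
route_for() {
  host=$1
  list=$2
  old_ifs=$IFS
  IFS=,
  for entry in $list; do
    # Drop any spaces typed around the commas.
    entry=$(printf '%s' "$entry" | tr -d ' ')
    [ -n "$entry" ] || continue
    case $host in
      "$entry" | *."$entry")
        IFS=$old_ifs
        echo direct
        return 0
        ;;
    esac
  done
  IFS=$old_ifs
  echo proxy
}
```

With the list from the excerpt, route_for adomain "adomain,bdomain,netscape.com" reports a direct connection, while a host in none of those domains is routed through the proxy.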

See Also: "SOCKS support in the Netscape Navigator" at http://home.netscape.com/assist/support/client/tn/cross-platform/10021.html.


Irix 5.2 netscape "Known Problems and Workarounds" Release Notes

The following hails from chapter 3 of the netscape Irix 5.2 Release Notes. To view the entire document run the CDrelnotes or CDgrelnotes scripts at the top-of-tree.


Copyright © 1995, Silicon Graphics, Inc.